A STANDARD BACTERIAL INDEX 1
By P. V. Wells 2 and W. F. Wells 3

1. INTRODUCTION 

For a number of years one of the authors has been struck with the 
simplicity and convenience of the logarithm as a variable in bacteri- 
ology, and has recommended its use in a number of papers. 4 The 
practical value of these suggestions has never been questioned, but 
considerable reluctance has been shown toward their adoption, due 
apparently to doubts as to their sufficient basis in statistical theory. 
In the present paper this aspect of the dilution method is considered, 
and it is shown that from every point of view the appropriate bac- 
terial index is the logarithm of the number of bacteria per liter. 

The reason for the dilution method is the wide variability in bac- 
terial populations. In investigating such distributions the observer 
must consider not only errors of measurement which change the 
value of the variate observed, due to imperfect apparatus and tech- 
nique, but also those fluctuations in frequency which occur quite 
independently of the precision with which the variates are measured, 
which are errors of sampling. Only those variations in excess of 
errors of measurement and of sampling are characteristic of the 
bacteria themselves, and indicate local conditions and their changes 
with time. 

If the bacteria are distributed at random throughout the sample 
of water, and the inoculations are made independently so that the 
variates are not correlated with each other, the theory of probabilities 
is able to predict the fluctuations of sampling. The mere presence 
of bacteria in a given dilution indicates a probable lower limit to the
number of bacteria in the original sample; a series of increasing
dilutions, however, including both positive and negative tubes, is a
combination of events which indicates very definitely the probable
number. Unfortunately the errors of sampling are much larger than
when the number of colonies is actually counted. But as it is the 
only method available for the estimation of B. coli, working rules 
for obtaining the bacterial index and its standard error of sampling 
are presented. 

In the method of bacterial counts the error of sampling of a single 
plate is less than 30 per cent when the count is above 10 colonies per 
plate, and so is of no practical importance. While the bacterial index 
is found to be obviously the most convenient variable in the dilution 
positive method, the necessity of taking logarithms is a disadvantage 
in the method of counts. A consideration of the errors of measure- 
ment, however, shows that this labor is more than justified by in- 
creased stability in the averages. The bacterial index is therefore 
the appropriate variable for both methods. 

A study of observed bacterial distributions in space and time 
amply confirms the wisdom of choosing the logarithm as a bacterial 
index. Distributions of this variable found in nature are approxi- 
mately normal, so that discussion of their properties is much simpli- 
fied. Moreover the index gives a more compact scale for routine 
work and the results are more easily correlated with other phenomena. 
But in spite of all these advantages some may feel that the number 
of bacteria is really a more fundamental and natural quantity, and 
that somehow or other the use of the index is a contravention of
statistical method; for the average index is the logarithm of the geo- 
metric mean number of bacteria, and the mere mention of this strange 
average gives rise to misgivings in their minds. To dispel such 
notions is not easy without entering into the maze of statistical 
theory. But an attempt will be made to explain as simply as pos- 
sible enough of the theory of averages to justify the adoption of the 
bacterial index as a standard variable in bacteriology. 

2. FUNDAMENTAL THEOREM 

The first interpretation of fermentation tube results from the 
standpoint of random sampling was made by McCrady. 5 This

5 McCrady, Jour. Infect. Diseases, 17, 183, 1915; Pub. Health Jour., 9, 201,
1918.
was followed by the more complete study of Greenwood and Yule, 6
and a series of interesting papers by Stein. 7 The theoretical basis 
of their work is a theorem in probabilities first stated by the 
great French mathematician Poisson in 1837, which has often been 
called the "law of small numbers," or the "law of small chances." 
This law applies wherever rare individuals are distributed at random 
in a very large population, and sufficiently large samples are taken 
to include a few of the exceptions. It might be called the "law of 
exceptions in a crowd." As the bacteria are extremely rare com- 
pared with the number of equivalent water "particles" in which they 
are diluted, there can be little question of the applicability of
"Poisson's exponential limit." It states that the chance (C_B) of (B)
bacteria occurring in a sample which would contain (z) bacteria if
the distribution were uniform is

$$C_B = \frac{z^B}{B!}\,\exp(-z) \qquad (1)$$

where the symbols denote the exponential (exp(−z) = e^{−z}) and the
factorial (B!).

Poisson's law has been applied in a great variety of cases (usually 
in ignorance of the others), such as the frequency of rare diseases, the 
counting of blood corpuscles, the counting of α-particles emitted by
radium, etc. A careful study of this law in Karl Pearson's laboratory 
by Whittaker 8 has revealed a number of unsolved questions, but for 
our problem the validity of equation (1) is quite sufficient. 

In applying the fundamental theorem to the interpretation of the 
presence of bacteria and to determine the fluctuations in sampling, 
it must be remembered that actual conditions are far from the ideal. 
Instead of being distributed individually at random throughout the 
water as assumed by the theory, bacteria may and probably do grow 
in colonies and adhere to suspended particles. The experimental 
procedure also tends to interfere with the random distribution, for 
the dilutions are made in steps and thus are not separate samples. 
The identification and counting present still further problems. 

Thus, while the theory of probabilities as applied to this case as- 
sumes perfect irregularity and independence in the "contributory 

6 Greenwood and Yule, Jour. Hygiene, 16, 36, 1917.

7 M. F. Stein, Am. Jour. Pub. Health, 11, 820-9, 1918; Jour. Bacteriology 4,
243-65, 1919; Eng. News Record, 82, 1106-9, 1919.

8 Lucy Whittaker, Biometrika, 10, 36-71, 1914. The function is tabulated in
Pearson's Tables (51).




cause-groups" (for curiously enough these very cases are the sim- 
plest), there can be little question that bacterial results are somewhat 
correlated. But it is in this borderland between perfect independence 
and perfect causality, i.e., in the field of correlation, that the phenom- 
ena defy analysis. It is hardly worth while to introduce these 
complications until the simpler cases are well understood. The 
theory can be regarded as a first approximation, giving the general 
march of the phenomena, just as it is applied to slightly "loaded" 
dice or to imperfect cards in other games of chance. The value of 
mathematics lies in the clearness of its reasoning, but its very exact- 
ness excludes from practical discussion much of the complexity of 
nature. 

3. THE BACTERIAL INDEX 

The usual procedure in bacteriology, when plates are not counted, 
is to record simply the presence or the absence of colonies at each 
dilution. Observations "bacteria present" are marked positive (+), 
those "no bacteria present" are marked negative (−). The dilutions
are made on a geometric scale. This "dilution scale" can be
formed merely by assigning serial numbers 1, 2, 3, 4, 5, 6, etc., to
the dilutions containing 100 cc, 10 cc, 1 cc, 0.1 cc, 0.01 cc, 0.001 cc,
etc., of the original sample in the inoculation; the "dilution" (d) 
represents the common logarithm of the number of bacteria per 
liter in the original sample for each bacterium in the dilution. The 
bacterial index (x), or logarithm of the number of bacteria per liter, 
is therefore 

x = d + log z (2) 

where (z) is the number in the inoculation at dilution (d). For 
example, one bacterium in 0.001 cc. of the original sample (z = 1,
in d = 6) gives an index x = 6, or 10^6 bacteria per liter.
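
As a small illustration of equation (2), the following sketch converts an inoculation volume and a number of bacteria into the bacterial index. The function names `dilution_number` and `bacterial_index` are illustrative only and are not taken from the paper; volumes are assumed to be given in cubic centimeters.

```python
import math

def dilution_number(volume_cc):
    """Dilution (d): common logarithm of the bacteria per liter represented
    by one bacterium in an inoculation of volume_cc cubic centimeters."""
    return math.log10(1000.0 / volume_cc)

def bacterial_index(volume_cc, z):
    """Equation (2): x = d + log z, with z the number in the inoculation."""
    return dilution_number(volume_cc) + math.log10(z)

# One bacterium in 0.001 cc is the worked example given in the text.
print(round(dilution_number(0.001), 6), round(bacterial_index(0.001, 1.0), 6))
```

With a volume of 100 cc the same function returns dilution 1, which is the start of the dilution scale described above.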

The chance (C_0) of observing no bacteria in the inoculation is,
by (1),

$$C_0 = \exp(-z) \qquad (3)$$

and the chance of a positive tube is (1 − C_0). The chance of dilution
(d) being positive and (d + 1) negative is the product of their
separate probabilities, if the dilutions are independent. Thus the
probability (C_{+−}) of the series (+ at d, − at d + 1) is

$$C_{+-} = [1 - \exp(-z)]\,\exp\!\left(-\frac{z}{10}\right) \qquad (4)$$

and the most probable value of (x) is x_m = d + 0.38, for z = ln 11
is the mode of (4). But the observed series should extend through
all dilutions, giving (. . . + + + − − − . . .), the last positive
dilution being at (d). This dilution is then called the "dilution
positive" (d+). The probability (C_{d+}) of this series is

$$C_{d+} = [1 - \exp(-10^2 z)]\,[1 - \exp(-10 z)]\,[1 - \exp(-z)]\,\exp\!\left(-\frac{z}{10}\right)\exp\!\left(-\frac{z}{10^2}\right) \qquad (5)$$

The mode of this function is z = ln 10.1, so that the most probable
value of the bacterial index (x_m) is

$$x_m = d_+ + 0.364 \qquad (6)$$

In words, if samples of water of various bacterial contents be
observed to give the endless series of dilutions in which all dilutions
less than and including (d+) are positive, all greater negative, that
sample for which x = d+ + 0.364 will be observed most frequently.
This is the correction to be applied to an observed "dilution positive"
(d+) to get the most probable bacterial index (x_m).
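
The two modes quoted above can be checked numerically. The sketch below is only an illustration: it assumes the five-dilution form of (5) (three positive dilutions followed by two negative ones), and the helper names `log_chance_4`, `log_chance_5` and `mode` are hypothetical. It maximizes the logarithm of each chance by a golden-section search, which is a swapped-in numerical method and not the authors' own procedure.

```python
import math

def log_chance_4(z):
    """Logarithm of (4): dilution d positive, dilution d + 1 negative."""
    return math.log(1.0 - math.exp(-z)) - z / 10.0

def log_chance_5(z):
    """Logarithm of (5): three positive dilutions, then two negative."""
    positive = sum(math.log(1.0 - math.exp(-z * 10.0 ** k)) for k in range(3))
    negative = -z / 10.0 - z / 100.0
    return positive + negative

def mode(log_chance, lo=0.5, hi=6.0, steps=100):
    """Golden-section search for the maximum of a single-peaked function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(steps):
        left = hi - ratio * (hi - lo)
        right = lo + ratio * (hi - lo)
        if log_chance(left) < log_chance(right):
            lo = left
        else:
            hi = right
    return (lo + hi) / 2.0

z4 = mode(log_chance_4)
z5 = mode(log_chance_5)
print(z4, math.log10(z4))   # compare with z = ln 11 and the correction 0.38
print(z5, math.log10(z5))   # compare with z = ln 10.1 and the correction 0.364
```

The common logarithm of each mode is the amount to be added to the dilution, since x = d + log z.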

When the test by a series of dilutions is repeated many times upon
the same sample of water, the dilution positive (d+) is found to vary
from test to test, so that the correction (6) is not of much use in
practice. Moreover the most frequent series (+ + + − − −) is not
the only type observed. "Skips" sometimes occur, such as
(+ + − + − −); it seems logical to "revert" these after the manner of
Phelps, that is permute (−) and (+) to give the regular order
(+ + + − − −).

One mode of procedure is to develop a general expression like (5) 
with the percentages positive and negative as unknowns (see equa- 
tion 14) and then solve for the most probable value (x m ). This is 
feasible when the number of dilutions considered is limited, and we 
shall use it later (article 7) for incomplete tests; but the solution is 
too complicated in the general case. 

4. THE AVERAGE BACTERIAL INDEX 

It is often simpler to find the average value by means of the 
probability function than to find the mode, or most probable value. 
Suppose the test by a series of dilutions is repeated many times upon 
the same sample of water, of given index (x), what average dilution 
positive is to be expected? Expressions similar to (5) give the
chances of any given dilution positive (d+). Now these probabilities
are the frequencies in very large series of tests, so that the average
dilution positive is formed simply by adding the products of each
(d+) by its frequency (C_{d+}). This quantity (d̄+) is found to differ
from the bacterial index (x) merely by an additive constant, thus
leading to a working rule of remarkable simplicity.

Taking the dilution (d) as origin, with (n) larger dilutions (d + 1,
d + 2, . . . d + n), and (n) smaller, there are (2n + 1) dilutions in the
series. Now these dilutions may be all positive, all negative, or any
other arrangement of positive and negative, in all

$$\left[\frac{(2n+1)!}{(2n+1)!} + \frac{(2n+1)!}{(2n)!\,1!} + \frac{(2n+1)!}{(2n-1)!\,2!} + \cdots + \frac{(2n+1)!}{(n+1)!\,n!}\right]$$

different permutations. Working out the probabilities of each of
these, after the manner of (5), multiplying each by its dilution
positive, and adding, the bacterial index (x) is obtained as a function of
the average dilution positive (d̄+). By inductive reasoning the
general expression is found to be

$$x = \bar{d}_+ + \left[C^{10^{-n}} + \cdots + C^{0.01} + C^{0.1} + C + C^{10} + C^{10^2} + \cdots + C^{10^n} + \log z - n\right] \qquad (7)$$

where x = d + log z, and C = exp(−z).

Fortunately the correction term in brackets is practically constant. 
The number of dilutions (2n + 1) need not exceed the number of
significant figures to which the probabilities are computed; but (n)
can always be taken very large, for increasing (n) merely adds unit 
and zero terms, and the unit terms exactly cancel the increase in (n). 
Now the correction has the same value for all multiples of (z) by a 
power of 10, for the change in the unit terms is exactly compensated 
by the change in (log z) . It is therefore necessary to consider values 
of (z) between 1 and 10 only. A few values are shown in table 1. 
The correction is very nearly constant; the one to be chosen is
obviously that corresponding to the most probable value of (z), namely:
z = ln 10.1 = 2.313, which gives

$$x = \bar{d}_+ + 0.231 \qquad (8)$$

The correction will not often exceed 0.24, and cannot exceed 0.27. 
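
The bracketed correction of (7) is easy to evaluate directly, which gives a check on the entries of table 1 and on (8). The sketch below is illustrative: the function name `correction` and the choice n = 12 are assumptions made here, the latter resting on the remark that (n) may be taken as large as desired.

```python
import math

def correction(z, n=12):
    """Bracketed term of (7): the sum of C**(10**k) for k = -n ... n,
    plus log10(z), minus n, where C = exp(-z)."""
    total = 0.0
    for k in range(-n, n + 1):
        total += math.exp(-z * 10.0 ** k)   # C**(10**k)
    return total + math.log10(z) - n

for z in (1, 2, 3, 4, 5, 6, 7, 8, 9):
    print(z, round(correction(float(z)), 3))
print(round(correction(2.313), 3))   # the value used in (8)
```

Because multiplying (z) by a power of 10 merely exchanges unit terms for an equal change in log z, evaluating the function for z between 1 and 10 covers all cases.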

The fluctuations of sampling a single specimen of water can be
eliminated, therefore, by taking the average dilution positive
(reverting skips) and adding + 0.24. In actual practice, however, a
single specimen is not of sufficient importance to warrant such effort. 
Each sample tested is from a different locality, and at a different 
time. An average of the bacterial contents is desired. Does the 
same correction to the average dilution positive of such space and 
time distributions give the average bacterial index? A little con- 
sideration shows that this method of averaging compensates for the 
fluctuations in sampling in one case as in the other. The compensa- 
tion would be complete if sufficient samples were taken and the 
correction (7) did not vary with (z), for then the deviations would be 
simply additive. But we have shown in table 1 how nearly con- 
stant the corrections are. Moreover, the correction for the mode 
is a minimum, so that it is changing most slowly just where the values
of (z) are most numerous. It is therefore reasonable to assume that 
the correction will rarely exceed + 0.24. 

TABLE 1

  z =          1      2      3      4      5      6      7      8      9
  x = d̄+ +   0.262  0.233  0.235  0.247  0.258  0.265  0.267  0.267  0.265



The proper working rule is thus: Add 1/4 dilution (or + 0.24) to
the observed average dilution positive (d̄+) to find the average bacterial
index (x̄) of any distribution.

5. THE STANDARD ERROR OF SAMPLING

To illustrate the method by which (8) is reached, a series of three 
dilutions is worked out numerically in table 2, choosing z = 2.313. 

From table 2 an interesting conclusion can be drawn as to the
fluctuation of sampling in the dilution positive method. In the last
column the second moment is computed, giving for the standard
error of sampling of the dilution positive σ = 0.523. A similar
computation for z = 1 gives x = d̄+ + 0.258 and σ = 0.589. A good
figure for the standard error of sampling of the dilution positive is
therefore

$$\sigma = 0.55 \qquad (9)$$

The standard error of sampling of the average dilution positive of
N tests is therefore

$$\sigma_N = \frac{0.55}{\sqrt{N}} \qquad (10)$$
TABLE 2

  Observed series, dilutions    Frequency      Dilution      d+ × C       d+² × C
    d     d + 1    d + 2        C (per cent)   positive d+   (per cent)   (per cent)
    +       −        −           69.9          d               0            0
    +       +        −           18.2          d + 1         +18.2         18.2
    −       −        −            7.7          d − 1          −7.7          7.7
    −       +        −            2.0          d               0            0
    +       −        +            1.6          d + 1          +1.6          1.6
    +       +        +            0.4          d + 2          +0.8          1.6
    −       −        +            0.2          d               0            0
    −       +        +            0.0          d + 1           0.0          0.0

    Totals                      100.0                        +20.6         29.1
                                                              −7.7
                                                             +12.9

    d̄+ = d + 0.129;  σ² = 0.291 − .017 = 0.274;  σ = 0.524

Bacterial index (log. bacteria per liter) x = d̄+ + 0.235
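
The construction of table 2 can be sketched in a few lines by enumerating the eight possible results of three dilutions. This is an illustration under stated assumptions, not the authors' computation: the function name `three_dilution_moments` is hypothetical, all dilutions below (d) are taken as positive, and skips are reverted, so that the dilution positive measured from (d) is the number of positive tubes less one.

```python
import itertools
import math

def three_dilution_moments(z):
    """Mean and standard error of the dilution positive, measured from d,
    for the dilutions d, d + 1, d + 2 with z bacteria expected at d."""
    p_positive = [1.0 - math.exp(-z / 10.0 ** k) for k in range(3)]
    mean = 0.0
    second = 0.0
    for tubes in itertools.product([True, False], repeat=3):
        chance = 1.0
        for positive, p in zip(tubes, p_positive):
            chance *= p if positive else 1.0 - p
        d_plus = sum(tubes) - 1          # skips reverted
        mean += d_plus * chance
        second += d_plus ** 2 * chance
    return mean, math.sqrt(second - mean ** 2)

mean, sigma = three_dilution_moments(2.313)
print(round(mean, 3), round(sigma, 3), round(math.log10(2.313) - mean, 3))
```

The values obtained should lie close to the 0.129, 0.524 and 0.235 of the table, the small differences arising from the rounding of the tabulated percentages.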

Thus apart from any errors of measurement, a standard deviation 
must exceed 0.55 to indicate any significant variation in the bac- 
terial distribution. 9 

How do these ideal conditions compare with actual experience? 
The most extensive series of tests are those compiled by Wolman 10 
for the distribution of B. coli in tap water. From his data we get 
the following results: 

TABLE 3

B. coli in tap water

  Year         Number tests   Average dilution   Index   Standard deviation
               (N)            positive (d̄+)      (x̄)     of single test (σ_N √6)
  1917         267            1.44               1.68    1.23
  1918         202            1.24               1.48    1.08
  1917-1919    614            1.23               1.47    0.81



The standard deviation (σ_N) in the last column is multiplied by √6
because each of his variates is an average of 6 tests. Now if the 

9 The ratio of the standard deviation to the standard error of sampling is
sometimes called the "Lexian ratio." The random sampling of independent 
variates is called by Yule "simple sampling," to distinguish it from the ran- 
dom sampling of correlated variates. 

10 A. Wolman, this Journal, 7, 927-930, 1920. 
B. coli index for tap water be kept below 1.47 with a permissible
standard deviation of 0.81, over half of this deviation may be ac- 
counted for by the standard error of sampling 0.55. This illustrates 
the significance of such fluctuations in the estimation of B. coli. 

Wolman found the cumulative diagram to be a straight line when
plotted on logarithmic probability paper. This shows that the
distribution of the index is normal, following the law of error. The
corresponding distribution for the variable (X), the number of bacteria
per liter, is therefore normal geometric, and the geometric mean (X_g),
which is simply the antilogarithm of the average bacterial index (x̄),
is the median of the X-distribution. It will be shown later that the
variability (V = σ_X/X̄) of normal geometric variation is given by
the relation V = √(exp(5.30 σ_x²) − 1), so that the last row of table 3
gives X_g = 30 B. coli per liter with a variability of 560 per cent.
The corresponding variability due to sampling is 200 per cent from
(9). It is evident what a large factor should be reserved in all such
estimates of bacterial content.
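
The relation between the standard deviation of the index and the variability of the number of bacteria is readily evaluated. In the sketch below the function name `variability` is an assumption of this illustration; the constant 5.30 of the text is taken to be the square of the natural logarithm of 10, which converts a deviation in common logarithms to one in natural logarithms.

```python
import math

def variability(sigma_index):
    """V = sqrt(exp(5.30 * sigma**2) - 1), where 5.30 is (ln 10)**2 and
    sigma is the standard deviation of the index in common logarithms."""
    return math.sqrt(math.exp((math.log(10.0) * sigma_index) ** 2) - 1.0)

print(round(100.0 * variability(0.81)))   # tap water, last row of table 3
print(round(100.0 * variability(0.55)))   # sampling alone, from (9)
print(round(10.0 ** 1.47))                # geometric mean B. coli per liter
```

The first two figures may be compared with the 560 and 200 per cent quoted in the text.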

From this point of view the dilution scale and the index have the 
obvious advantage of uniform coarseness, giving no false impression 
of precision and wasting no figures in expressing large numbers. 
For example, an index 10 represents a bacterial content which is 
rarely reached even in the worst sewage, while an index 1.5 represents 
tap water. This convenient geometric scale is provided in the
simplest manner imaginable, merely by assigning serial numbers 1, 2, 3, 4,
etc., to the successive dilutions containing 100 cc., 10 cc., 1 cc., 0.1 cc.,
etc., of the original sample, and then observing the average dilution
positive.

6. INTERPRETATION OF THE DILUTION POSITIVE METHOD 

It is interesting to note that much simpler conditions lead to about 
the same correction as (8). Thus ignoring all dilutions except (d) 
and (d + 1), assume that the chance of the former being positive 
equals the chance that the latter is negative. The condition is 
1 − exp(−z) = exp(−0.1 z), which gives the solution x = d +
0.255. The same equation expresses the condition that either (d)
or (d + 1) must be positive; it also follows if one of these dilutions 
must be negative. All three of these conditions are logical and con- 
sistent, but an examination of table 2 shows that they ignore too 
many facts to have much weight. 
It may be remarked that taking the "dilution positive" as variate
is equivalent to forming a weighted average of all the dilutions,
assigning to the dilution (d) the weight (−ΔP = P_d − P_{d+1}).
Here (P_d) is the frequency of positive tests in dilution (d), (P_{d+1}) is
this frequency in the next dilution. Since Σ(−ΔP) = 1, we have
for the weighted average dilution

$$\bar{d}_+ = \sum_{-\infty}^{+\infty} d\,(-\Delta P) \qquad (11)$$

Now we can see the effect of changing the dilution interval, for when
Δd → 0, this weighted average becomes

$$\bar{d}_+ = -\int d\,\frac{dP}{dd}\,dd = \int_0^\infty \exp(-z)\,[x - \log z]\,dz \qquad (12)$$

since d = x − log z, and 1 − P = exp(−z). Integrating,

$$x = \bar{d}_+ - 0.242 \qquad (13)$$

showing that the correction decreases with the dilution interval,
vanishing at about half-dilutions, and becomes negative for
infinitesimal intervals.

It is also evident from (12) that averaging dilutions (d) instead of
(x) is neglecting the factor log z, and that it is the weighted average
of this neglected factor which gives the nearly constant correction
+ 1/4, when Δd = 1. But one must adopt some arbitrary system
of weighting, for dilutions which are all positive or all negative
cannot all possess the same significance. Stein 11 would discard all
dilutions for which (P) is above 85 per cent or below 30 per cent. The
simpler procedure of averaging the dilutions positive accomplishes
the same result. He also recommends at least 10 tests in forming
an average. By (10) this reduces the standard error of sampling
of the mean index to 0.55/√10 = 0.174, which corresponds to a
variability in the mean of normal geometric variation of 42 per cent.

In practice the method of weighting indicated by (11) is very 
simple as is shown in the example, table 4. The results of tests on 

11 M. F. Stein, Jour. Bacteriology, 4, 243-265, 1919. 
(N = 10) samples are recorded in the columns numbered 1 to 10, each
tube opposite its dilution shown in the first column. Counting the
number of positive tubes in each dilution, take first differences
(−NΔP), then the partial moments (−dNΔP) and second moments
(−d²NΔP), etc.











TABLE 4

Example of dilution positive method

[Columns 1 to 10, giving the individual tube records (+) for each sample, are not legible in this copy; the counts of positive tubes (NP) are retained.]

  Dilution (d)   (NP)   (−NΔP)   (−dNΔP)   (−d²NΔP)
       3          10       3         9         27
       4           7       4        16         64
       5           3       2        10         50
       6           1       1         6         36
       7           0

  N = 10                            41        177
  d̄+ = 4.1                                   −168
                                          σ² = 0.9
                                          σ  = 0.95
                                          σ_N = 0.30

Average bacterial index x̄ = 4.34 ± 0.30

Geometric mean bacteria per cubic centimeter X_g = 22 ± 78 per cent
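
The arithmetic of table 4 may be sketched as follows, starting from the number of positive tubes at each dilution. The function name `dilution_positive_summary` is hypothetical, and the sketch assumes a complete test, that is, all tubes positive in the first dilution listed and none in the last.

```python
import math

def dilution_positive_summary(dilutions, positives, n_tests, correction=0.24):
    """Average dilution positive, its standard deviation, the average
    bacterial index and the standard error of that average."""
    # First differences (-N dP): tests whose dilution positive falls at d.
    weights = [positives[i] - positives[i + 1] for i in range(len(positives) - 1)]
    mean = sum(d * w for d, w in zip(dilutions, weights)) / n_tests
    second = sum(d * d * w for d, w in zip(dilutions, weights)) / n_tests
    sigma = math.sqrt(second - mean ** 2)
    return mean, sigma, mean + correction, sigma / math.sqrt(n_tests)

mean, sigma, index, error = dilution_positive_summary(
    [3, 4, 5, 6, 7], [10, 7, 3, 1, 0], 10)
print(round(mean, 2), round(sigma, 2), round(index, 2), round(error, 2))
print(round(10.0 ** index / 1000.0))   # geometric mean per cubic centimeter
```

The antilogarithm of the index gives bacteria per liter, so it is divided by 1000 to express the geometric mean per cubic centimeter.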



7. INCOMPLETE TESTS 

The average dilution positive cannot be applied to very pure waters 
in which less than 37 per cent of the tubes are positive in dilution 1 
(containing 100 cc. of the sample), for it is inconvenient to complete 
such tests by including inoculations of 1 liter, 10 liters, etc., until the 
"dilution positive" is located. Such cases, moreover, will become 
more important as the standards of purity for water supplies are 
improved. It is fortunately a very simple matter to compute the 
most probable number of bacteria for such incomplete tests. 

Suppose that a sample of water containing (X) bacteria per liter
be tested, taking (v_1, v_2, v_3), etc., liters as inoculations, and making
(N) tubes at each of these volumes, with results (P_1 N) positive,
(Q_1 N) negative in (v_1), (P_2 N) positive, (Q_2 N) negative in (v_2), etc.
Then the probability of this result is, after the manner of (5),

$$C = \exp(-X v_1 Q_1 N)\,[1 - \exp(-X v_1)]^{P_1 N}\,\exp(-X v_2 Q_2 N)\,[1 - \exp(-X v_2)]^{P_2 N} \cdots \qquad (14)$$
It will be noticed that this is exactly the integrand of Greenwood
and Yule's 12 general expression. The most probable value of (X) is
therefore given by the maximum condition

$$\sum_i v_i\,\frac{P_i}{M_i} - \sum_i v_i = 0 \qquad (15)$$

where M_i = 1 − exp(−X_m v_i) is the most probable fraction positive
in tubes containing volume (v_i), while (P_i) is the fraction actually
observed. The mean of this ratio, weighted by the (v's), is therefore
unity.

This expression is unworkable in the case of a complete test; but
suppose that three volumes only are taken, the 100 cc., the 10 cc.,
and the 1 cc., giving the fractions positive (P_1), (P_2), (P_3),
respectively. Assume, moreover, that P_1 < 37 per cent. In this case we
get the very simple approximate expression 13

$$X_m = 9\,(P_1 + P_2 + P_3) \qquad (16)$$

The most probable number of bacteria per liter (X_m) is therefore simply
nine times the sum of the fractions positive in dilutions 1, 2, and 3
(100 cc., 10 cc., 1 cc.), when the fraction positive in dilution 1 is less
than 25 per cent.

When the fraction positive in dilution 1 is greater than 37 per cent,
the most probable value for dilution 0 is 100 per cent positive and so
we may apply the ordinary method as in table 4. In other words,
the test may be made complete by taking dilution 0, 100 per cent
positive. The most probable value for the bacterial index (x_m) is
simply

$$x_m = \log\,[100\,(P_1 + P_2 + P_3)] - 1.05 \qquad (17)$$

Applying (16) and (17) to Greenwood and Yule's two examples, we 
get immediately the results in table 5. 

12 Greenwood and Yule, Journal of Hygiene, 16, 36-54, 1917-1918. Our
development is similar to theirs, but we arrive at a simple working rule.

13 The derivation of this expression is rather long. Taking v_2 = 10 v_3,
v_1 = 100 v_3, we get, neglecting higher powers of X_m v_3,

X --'lfl 2(Pi + P 2 + P, 1 

m v, [_ 222 + 9 p i + 99 p * + mP * J 

which reduces to (16) when v_3 = 10^{-3} liters, and P_1 < 0.37, etc. (It is fairly
simple to check this for two dilutions.) 
The simple formula gives values far more precisely than is needed,
considering the very large errors of sampling to which the method
is susceptible. As the values of (P_1) increase the approximation
becomes poorer until at 25 per cent the errors of (8) and (17) are
about equal. At P_1 = 25 per cent (17) gives X_m = 2.5, the correct
value being X = 2.9, while (8) gives X = 3.3 bacteria per liter, so
that the errors in the formula (16) never exceed 13 per cent. For
values of (P_1) above 25 per cent it is better to use (8). At 37 per
cent to assume that P_0 = 100 per cent introduces no error in the
average dilution positive.
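
The working rules (16) and (17) can be sketched as below and tried on the tube counts of Greenwood and Yule's two sources (30, 5, 2 positive of 338 tubes, and 21, 6, 4 of 333). The function names are hypothetical, and `expected_fractions` simply applies (3) to each volume in order to reproduce the comparison at P_1 = 25 per cent; none of this is the authors' own code.

```python
import math

def most_probable_number(positives, n_tubes):
    """Rule (16): X_m = 9 * (P_1 + P_2 + P_3) bacteria per liter,
    for inoculations of 100 cc, 10 cc and 1 cc."""
    return 9.0 * sum(p / n_tubes for p in positives)

def most_probable_index(positives, n_tubes):
    """Rule (17): x_m = log10[100 * (P_1 + P_2 + P_3)] - 1.05."""
    total = sum(p / n_tubes for p in positives)
    return math.log10(100.0 * total) - 1.05

def expected_fractions(x_per_liter, volumes_liters=(0.1, 0.01, 0.001)):
    """Fractions positive expected from (3) for a known content X."""
    return [1.0 - math.exp(-x_per_liter * v) for v in volumes_liters]

source_a = most_probable_number([30, 5, 2], 338)
source_b = most_probable_number([21, 6, 4], 333)
# Content making 25 per cent of the 100 cc tubes positive, and rule (16)
# applied to the fractions that content would give.
x_true = -math.log(0.75) / 0.1
x_rule = 9.0 * sum(expected_fractions(x_true))
print(round(source_a, 3), round(source_b, 3))
print(round(x_true, 1), round(x_rule, 1))
```

Carried through without intermediate rounding, the first source comes out slightly above the tabulated figure, which appears to have been formed from the rounded fraction 0.109.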

TABLE 5

B. coli in very pure water

                                    Source A            Source B
                                    N = 338             N = 333
  Dilution (d)                      Number positive     Number positive
  1                                 30                  21
  2                                 5                   6
  3                                 2                   4
  Sum of fractions positive         37/338 = 0.109      31/333 = 0.0931
  X_m = 9 × sum (per liter)         0.981               0.838
  Values of Greenwood and Yule      0.965               0.838
  Bacterial index x_m               −0.02               −0.08



Now suppose that it is not desirable to use as much as 100 cc. for 
an inoculation but that the three dilutions 2, 3, 4 (10 cc., 1 cc., 0.1
cc.) are included in the test, and that P_2 < 25 per cent. The most
probable bacterial index is then

$$x_m = \log\,[100\,(P_2 + P_3 + P_4)] - 0.05 \qquad (18)$$

or the most probable number of bacteria per liter is roughly equal 
to the sum of the percentages positive in the three dilutions. It is 
always preferable, however, to include inoculations large enough to 
get at least 37 per cent positive tubes, for the errors of sampling are 
probably much larger when the tests are incomplete. 



8. THE METHOD OF BACTERIAL COUNTS

Whenever it is possible to count the actual number of colonies 
at each dilution, as in the plating method, the errors of sampling can 
be reduced to negligible proportions. The fundamental theorem
(1) gives directly for the arithmetic mean bacterial count (B̄)

$$\bar{B} = \sum_B B\,C_B = z \qquad (19)$$

and for the standard deviation (σ_B) from it

$$\sigma_B^2 = \sum_B (B - \bar{B})^2\,C_B = z \qquad (20)$$

The variability (V) of sampling is therefore V = σ_B/B̄ = 1/√z, so
that for plates with z = 10 colonies the variability is 30 per cent,
while for z = 100 this is reduced to 10 per cent. These figures are
halved for averages of four plates, and are reduced by the factor
1/√N for averages of N variates. Obviously fluctuations of sampling
are of little moment in the method of bacterial counts.
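
Equations (19) and (20) and the resulting variability can be verified by summing Poisson's law term by term. The sketch below is illustrative only; the function names and the cut-off of 400 terms are assumptions chosen for plate counts of the size discussed here.

```python
import math

def poisson_moments(z, terms=400):
    """Mean and variance of Poisson's law (1), summed term by term."""
    chance = math.exp(-z)          # C_0
    mean = 0.0
    second = 0.0
    for b in range(terms):
        mean += b * chance
        second += b * b * chance
        chance *= z / (b + 1)      # C_(B+1) = C_B * z / (B + 1)
    return mean, second - mean ** 2

def sampling_variability(z, n_plates=1):
    """V = 1 / sqrt(z), reduced by 1 / sqrt(N) for an average of N plates."""
    return 1.0 / math.sqrt(z * n_plates)

print(poisson_moments(10.0))
print(sampling_variability(10.0), sampling_variability(100.0))
print(sampling_variability(10.0, n_plates=4))
```

For z = 10 the variability is a little over 30 per cent, which the text rounds to 30.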

The average number of bacteria per liter is simply calculated by 
the formula X = DB, where (B) is the count at dilution D = antilog
d = 1/v, ((v) is the volume of the inoculation in liters). To form the
average bacterial index (x̄), the logarithm of each count must be
taken. That is, 

x = d + b, where b = log B (21) 

For this a two-place table is sufficient (table 6). 

TABLE 6

Two-place logarithms

        0    1    2    3    4    5    6    7    8    9
  1    00   04   08   11   15   18   20   23   26   28
  2    30   32   34   36   38   40   42   43   45   46
  3    48   49   51   52   53   54   56   57   58   59
  4    60   61   62   63   64   65   66   67   68   69
  5    70   71   72   72   73   74   75   76   76   77
  6    78   79   79   80   81   81   82   83   83   84
  7    85   85   86   86   87   88   88   89   89   90
  8    90   91   91   92   92   93   93   94   94   95
  9    95   96   96   97   97   98   98   99   99   00



The geometric mean number of bacteria (X_g = antilog x̄) which
corresponds to the average index is smaller than the arithmetic
mean (X̄), and for moderate variabilities is given by the approximate
relation 14

$$X_g = \bar{X} - \frac{\sigma_X^2}{2\bar{X}} \qquad (22)$$

Now the most probable bacterial count (B_m) is found from Pearson's
slope relation to be B_m = z − 1/2. Comparing (19), (20), and (22),
we have

$$X_g = D\left(z - \tfrac{1}{2}\right) = X_m \qquad (23)$$

so that the mode of Poisson's exponential limit is closer to the geometric 
mean than to any other simple average. This disposes of any theoreti- 
cal objections to the use of the geometric mean count as the most 
probable number of bacteria in a given sample. The variability is 
so slight, however, that any difference between the arithmetic and 
geometric means must be due either to experimental errors or to 
the bacterial distribution in space or time. 

9. THE STABILITY OF AVERAGES

We have examined three experimental procedures, obtaining simple 
working rules for the bacterial content (1) from the average dilution 
positive, (2) from the total per cent positive in three dilutions and 
(3) from the bacterial count itself. In the two latter cases the 
number of bacteria is more simply computed than is the bacterial 
index. In the first case only is the index found with less labor. But 
a study of the errors of measurement shows the real reason for pre- 
ferring the average index, and its antilogarithm the geometric mean. 

The essential property of an average, that which makes it more 
useful than any individual observation, is its stability. Its value 
must be representative of all the data, not dominated by a few. Now 
in any case where the data vary widely a few very large values, 
representing perhaps less than one per cent of the results, may halve 
or double the value of the arithmetic mean; their effect upon the 
geometric mean is but slight. This renders the arithmetic mean 
practically valueless in distributions of wide variability. 

This point can be seen immediately from a consideration of the 
frequency curve. As Pearson has shown, the ordinate representing 
the arithmetic mean passes through the center of gravity of the 
frequency curve. Now in cases of wide variability the curve runs 

14 See Yule, Theory of Statistics, 3rd ed. p. 156, London, 1916. 
out to enormous values, in a very long "tail." The lever-arm of
these extreme values is tremendous, so that one such result dominates 
hundreds of ordinary ones. The value of the arithmetic mean is 
determined not by the large numbers of usual results but by the 
few chance "huge" ones. The procedure of discarding arbitrarily 
all observations of this class is not so satisfactory as the determina- 
tion of the geometric mean, which gives each observation equal 
weight on a proportional scale. 

It is difficult for one habituated to the small variations of the 
physical sciences to appreciate the effect of wide variation on the 
arithmetic mean. Take the following illustration from an actual 
bacteriological distribution. 15 



TABLE 7

OBSERVATIONS     BACTERIA
PER 1000         PER LITER                         log X
ΔN               X                  XΔN            = x       xΔN

    1                    10                10       1           1
   29                   100             2,900       2          58
  120                 1,000           120,000       3         360
  452                10,000         4,520,000       4       1,808
  311               100,000        31,100,000       5       1,555
   79             1,000,000        79,000,000       6         474
    8            10,000,000        80,000,000       7          56

ΣXΔN = 194,742,910          ΣxΔN = 4,312

Arithmetic mean = 195,000, geometric mean = 20,500

Same, omitting last 8 observations, reduced by:
(in N) 0.8 per cent
(in arithmetic mean) 41 per cent
(in geometric mean) 4.9 per cent

By omitting the last eight observations (in one thousand) the arith- 
metic mean is reduced over 40 per cent, more than eight times the 
change in the geometric mean. A glance at the last column shows 
that the geometric mean is not much affected by where the "tail" 
is cut off, whereas there is no indication that even one thousand 
observations are enough to determine the arithmetic mean. The 
same point is shown graphically in figure 1, the total areas represent- 
ing first moments. The arithmetic mean is quite indeterminate. 
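
The figures of table 7 can be recomputed directly. The following is a minimal Python sketch, not part of the original paper; the function and variable names are illustrative assumptions, and the data are the seven grouped rows of table 7.

```python
import math

# Table 7 as (observations per 1000, bacteria per liter) pairs.
TABLE_7 = [(1, 10), (29, 100), (120, 1_000), (452, 10_000),
           (311, 100_000), (79, 1_000_000), (8, 10_000_000)]

def arithmetic_mean(groups):
    """Weighted arithmetic mean of the grouped counts."""
    n = sum(count for count, _ in groups)
    return sum(count * x for count, x in groups) / n

def geometric_mean(groups):
    """Antilogarithm of the weighted average index (log10 of the count)."""
    n = sum(count for count, _ in groups)
    mean_index = sum(count * math.log10(x) for count, x in groups) / n
    return 10 ** mean_index

trimmed = TABLE_7[:-1]  # omit the 8 observations at 10,000,000 per liter

am_full, gm_full = arithmetic_mean(TABLE_7), geometric_mean(TABLE_7)
am_trim, gm_trim = arithmetic_mean(trimmed), geometric_mean(trimmed)

# Percentage reduction of each mean caused by cutting off the "tail".
am_drop = 100 * (1 - am_trim / am_full)
gm_drop = 100 * (1 - gm_trim / gm_full)

print(f"arithmetic mean: {am_full:,.0f} -> {am_trim:,.0f} ({am_drop:.1f} per cent)")
print(f"geometric mean:  {gm_full:,.0f} -> {gm_trim:,.0f} ({gm_drop:.1f} per cent)")
```

Run on these rows, the sketch should show the arithmetic mean falling by roughly 41 per cent and the geometric mean by roughly 5 per cent, in line with the table.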

15 Sir A. C. Houston, Journal of Pathology and Bacteriology, 18, 351, 1913-14.

Fig. 1

One may reply that a single value cannot be expected to represent
such a wide distribution. This is perfectly true; the variability is
here perhaps the most important feature. But the lack of stability
of the arithmetic mean is even outdone by the instability of the
standard deviation, and it is still worse for the higher moments.
These constants which the Pearson school of statistics has taught us 
to regard as the most important characteristics of the distribution,
cannot be determined with precision even from a thousand observations.
Pearson has called attention to the necessity of extremely
large numbers of observations in order to overcome the fluctuations 
of sampling, but apparently no one has appreciated the immense 
superiority of the geometric mean in this regard. 

In using the arithmetic mean each variate should be measured 
with the same absolute precision, for errors in the mean are simply 
(1/N)th of the errors in the variate. This requires a variate
1,000,000 to be known to 6 significant figures if the variate 1000 is 
known to 3 figures. This is all right in a bank, where a check for a 
million dollars must be recorded to the last penny. But what is the 
case in bacteriology? It is precisely the large variates which are the
least reliable, not only absolutely but even proportionally, for they 
have been subjected to more dilutions. An absolute error of 1000 
bacteria in the variate 1,000,000 is more probable than is an error of 
1 bacterium in the variate 1000. Taking the arithmetic mean is 
equivalent to treating the better variate as an error of the worse 
variate. Yet bacteriologists hesitate in adopting the geometric 
mean. 

The point is simply stated mathematically. The same proportional
error in the variate (X) produces the proportional error
($\Delta\bar{X}/\bar{X}$) in the arithmetic mean ($\bar{X}$) and the proportional error

$$\frac{\Delta X_g}{X_g} = \frac{\bar{X}}{X}\,\frac{\Delta\bar{X}}{\bar{X}} \qquad (24)$$

in the geometric mean ($X_g$). The same effect is illustrated graphically
in figure 2, showing the per cent error in the means resulting
from a given per cent error of frequency in the variate (X). The 
superiority of the arithmetic mean for small variates is of little im- 
portance compared with its grave defects for very large ones. The 
harmonic mean (the reciprocal of the average reciprocal) is even 
less affected by errors in large variates than is the geometric mean, 
and would form an interesting index, leading to expressions of 
"volumes contain 1 bacillus," "dilutions (D)" etc. But the har- 
monic mean overemphasizes negative results and the reciprocal pos- 
sesses inconveniences as a variable. 
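
The comparison of the two means under an error in a single variate can be checked numerically. The sketch below is an illustration only: the five counts are hypothetical, and the 1 per cent error is an arbitrary choice. It perturbs one variate at a time and compares the proportional errors of the arithmetic and geometric means; to first order their ratio should equal the arithmetic mean divided by the variate that was perturbed.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    # Antilogarithm of the average common logarithm.
    return 10 ** (sum(math.log10(x) for x in xs) / len(xs))

def proportional_errors(xs, i, rel_err):
    """Proportional change of each mean when xs[i] is off by rel_err."""
    perturbed = list(xs)
    perturbed[i] = xs[i] * (1 + rel_err)
    am_err = arithmetic_mean(perturbed) / arithmetic_mean(xs) - 1
    gm_err = geometric_mean(perturbed) / geometric_mean(xs) - 1
    return am_err, gm_err

counts = [10, 100, 1_000, 10_000, 1_000_000]  # hypothetical counts per liter
mean = arithmetic_mean(counts)

rows = []
for i, x in enumerate(counts):
    am_err, gm_err = proportional_errors(counts, i, 0.01)
    # Observed ratio of errors versus the first-order prediction mean / x.
    rows.append((x, am_err, gm_err, gm_err / am_err, mean / x))
    print(f"X={x:>9,}  AM error={am_err:.2e}  GM error={gm_err:.2e}  "
          f"ratio={gm_err / am_err:.3g}  mean/X={mean / x:.3g}")
```

For variates below the arithmetic mean the geometric mean shows the larger error, and for the largest variate the arithmetic mean does, which is the asymmetry described in the text.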

Suppose the same absolute error to be made in each variate, in- 
creasing each (X) to (X + e). The arithmetic mean is increased 
the same amount (e), but the geometric mean is increased more proportionally.
If each variate be increased by the same proportion
(X (1 + e)), the arithmetic and the geometric mean are both increased
in the same proportion. But if larger variates are increased in 
greater proportion, the proportional increase in the arithmetic mean 
is greater than that in the geometric mean, and this superiority of 
the geometric mean increases rapidly with the variability. This is 
precisely the case in bacteriology, for even proportional errors of 
measurement are larger in the larger variates. 
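
The three cases of this paragraph can be tried on a small example. This is a sketch under stated assumptions: the counts are hypothetical, the absolute error of 5 and the proportional error of 10 per cent are arbitrary, and the third case assumes, purely for illustration, a proportional error of 2 per cent per decade of the count.

```python
import math

def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def geometric_mean(xs):
    return 10 ** (sum(math.log10(x) for x in xs) / len(xs))

def relative_change(mean_fn, before, after):
    """Proportional increase of a mean between two data sets."""
    return mean_fn(after) / mean_fn(before) - 1

counts = [10, 100, 1_000, 10_000, 1_000_000]  # hypothetical counts per liter

# Case 1: the same absolute error added to every variate.
shifted = [x + 5 for x in counts]
# Case 2: the same proportional error in every variate.
scaled = [x * 1.10 for x in counts]
# Case 3: proportional error growing with the size of the variate
# (assumed here to be 2 per cent per decade of the count).
skewed = [x * (1 + 0.02 * math.log10(x)) for x in counts]

cases = {}
for name, data in [("absolute", shifted), ("proportional", scaled),
                   ("growing", skewed)]:
    cases[name] = (relative_change(arithmetic_mean, counts, data),
                   relative_change(geometric_mean, counts, data))
    print(name, cases[name])
```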

Fig. 2. Error in Mean Due to Fluctuation in Frequency (horizontal axis: Variate X)

The stability of the geometric mean is very real even in extensive
series of samples. Its greatest utility, however, is when few samples
are taken. In actual practice it will be found that significant
tendencies and relations can be discovered in geometric means when 
the arithmetic means are hopelessly erratic. Or, from another point 
of view, the same results can be obtained from fewer observations. 
The stability of the geometric mean extends in like measure to its 
logarithm, the average index. Indeed the chief value of this variable 
is that it avoids one computation in routine work. Why should one 
bother to find (X) when (x), a more convenient variable, is at hand? 
Of course, in reports and popular illustration the geometric mean 
should be used; it can be called simply the "usual number of bacteria 
per liter." But in graphs the index is almost necessary.

10. THE CHOICE OF AVERAGE

There has been considerable reluctance on the part of bacteri- 
ologists to using the geometric mean, not on any "practical" grounds, 
but because they fear it as a contravention of statistical theory. In 
fact, the elegant systematic methods of Pearson appear to exclude 
any average but the (arithmetic) mean. The variates (observations) 
are grouped, the frequency distribution formed, and the moments 
give the mean, standard deviation, etc., in the simplest manner. But 
the choice of character to be observed is quite arbitrary, a fact usually 
passed over in silence. Considerations of convenience dictate this 
choice, and such considerations may likewise indicate a change of 
variable at the very start of the investigation. The simple average 
in this new variable corresponds to quite a different type of average of 
the variates. If the logarithm is used as variable, the variate average 
is the geometric mean. The geometric mean is therefore perfectly 
consistent with Pearson's theory. 

The fact that averages change with change of variable completely 
alters our point of view with regard to them. The choice of variable, 
as of observed character, is arbitrary, a matter of convenience. This
cannot be too strongly emphasized, because many scientific workers 
have a vague notion that the (arithmetic) mean is ordained by the 
"laws of chance." If chance ordains any average, it is the median, 
for this is the only average which retains the same significance re- 
gardless of the variable chosen. Indeed Laplace, 16 the greatest 
authority in this subject, chose the median as fundamental, in spite 
of the fact that previous mathematicians had chosen the mode, or 
most probable value. But the mode changes with changing variable 
as do all averages except the median. 
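
The behaviour of the median and the mean under a change of variable can be illustrated as follows. This is a minimal sketch with hypothetical counts; an odd number of observations is assumed so that the median is itself one of the observations.

```python
import math
import statistics

counts = [10, 100, 1_000, 10_000, 1_000_000]   # hypothetical counts per liter
indices = [math.log10(x) for x in counts]      # the same data as logarithms

# The median picks out the same observation in either variable.
median_of_counts = statistics.median(counts)
median_via_index = 10 ** statistics.median(indices)

# The simple average does not: averaging the index and transforming back
# gives the geometric mean, not the arithmetic mean.
mean_of_counts = statistics.fmean(counts)
mean_via_index = 10 ** statistics.fmean(indices)

print(median_of_counts, median_via_index)
print(mean_of_counts, mean_via_index)
```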

The problem is as old as Galilei, 17 who was called in by his Floren- 
tine friends to answer the following question: a horse is really worth a 

16 Laplace, Théorie Analytique des Probabilités, Nat. Ed., Livre II, no. 23,
p. 340. "Des géomètres célèbres ont pris pour le milieu qu'il faut choisir
celui qui rend le résultat observé le plus probable, et par conséquent l'abscisse
qui répond à la plus grande ordonnée de la courbe; mais le milieu que nous
adoptons (median) est évidemment indiqué par la théorie des probabilités."
[Celebrated geometers have taken as the mean to be chosen the one that renders
the observed result most probable, and consequently the abscissa corresponding
to the greatest ordinate of the curve; but the mean which we adopt (median)
is evidently indicated by the theory of probabilities.]
To Laplace the median was the most advantageous value, because it rendered
the sum of the deviations (taken without regard to sign) a minimum. The
arithmetic mean renders the sum of the squares of the deviations (from it) a
minimum. Both criteria are arbitrary.

17 See I. Todhunter, History of the Theory of Probability, p. 5, Macmillan,
1865.

hundred crowns. One person estimated it at ten crowns, another at a
thousand. Which is the more extravagant estimate? He pronounced 
the two estimates equally extravagant, anticipating Fechner's law 
of sensation. The priest Nozzolini pronounced the higher estimate 
the more extravagant — sound business practice. This apparently 
trivial question illustrates perfectly the usual ambiguity of purpose 
in the choice of averages. 
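
The two readings of the horse problem correspond to two ways of measuring a deviation. The short sketch below uses only the figures of the anecdote; the names are illustrative.

```python
import math

true_value = 100                       # crowns
estimates = {"low": 10, "high": 1000}  # the two estimates in the anecdote

deviations = {}
for name, estimate in estimates.items():
    # Arithmetic scale: difference in crowns (Nozzolini's view).
    arithmetic_dev = abs(estimate - true_value)
    # Geometric scale: difference of logarithms (Galilei's view).
    proportional_dev = abs(math.log10(estimate / true_value))
    deviations[name] = (arithmetic_dev, proportional_dev)
    print(name, arithmetic_dev, proportional_dev)
```

On the arithmetic scale the high estimate errs ten times as much as the low one; on the logarithmic scale the two errors are equal.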

Jevons 18 introduced the geometric mean into practice, to obtain 
an index number for prices. He was much impressed with its im- 
portance, considering that "in almost all the calculations of statistics 
and commerce the geometric mean ought, strictly speaking, to be 
used." Galton 19 concluded similarly: "in short, sociological phe- 
nomena, like vital phenomena are, as a general rule, subject to the 
condition of the geometric mean." Unfortunately natural phenomena 
are not so ideally simple, upon either a geometric or an arithmetic 
scale. In many cases variation can be best regarded as the product 
of factors, and then proportional changes appear natural. Thus we 
consider the effect of environment upon an object, or of the sur- 
roundings upon a system. But in general the causes determine the 
magnitude of the variate neither by their sum nor by their product. 

The choice of average is therefore a matter of practical compro- 
mise. When the purpose in hand is sufficiently definite, the suitable 
average is often very complicated. Economists have carried the 
theory farthest, in their search of an index number for prices. Re- 
cent studies by Walsh and Fisher 20 have shown that the geometric 
mean of two weighted arithmetic means is required. Can there be 
any doubt that in most cases the choice of the arithmetic mean rests 
merely on the desire for simplicity? 

A change of variable often introduces simplicity in scientific laws. 
For example, the change from cartesian to spherical or cylindrical 
coordinates, from coordinates to "generalized coordinates" and from
velocities to momenta in mechanics, from centigrade to absolute 
temperature, from pressure, temperature and volume to "correspond- 
ing" pressure, etc., in the theory of corresponding states, and from 
wave-length to frequency in optics. Of course, these theories assume 
perfect causality, so that fixing all variables except one is supposed

18 W. S. Jevons, Fall in Value of Gold, 1863; Principles of Science (Am. Ed.)
p. 419, 1873.

19 F. Galton, Proc. R. S., 29, 365-76, 1879; ibid., 40, 42-73, 1886.
20 Irving Fisher, Am. Stat. Assn. Quarterly, March, 1921.

to determine exactly the value for that variable. But in nature
variates are merely correlated, and "natural quantities" are actually
averages of distributed variates. Now the question arises: In
which variable shall we compute the average? In other words:
What average shall we use? For if (A) denote the average of the
variable (X) corresponding to the simple average ($\bar{x}$) of the variable
x = f(X), we have

$$f(A) = \frac{1}{N}\sum f(X) \qquad (25)$$

Taking the arithmetic mean of the observed character is no solution, 
for who shall decide which variate shall be observed? Bertrand 21 
has remarked that there is no a priori reason for expecting that one 
function rather than another of the variate should vary by simple 
chance; and Pearson has more recently shown that the normal law 
is really too simple to describe the vast majority of frequency dis- 
tributions. The conventional character of scientific method is ob- 
vious. No general rule can be given for choosing the appropriate 
average. The criterion must be convenience. For specific purposes 
certain conditions may decide the question, as the equation of ex- 
change requires particular properties in an index number for prices. 
If experience is wide enough, however, probably no single average 
will suffice for all purposes. 
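
The general relation between a change of variable and the resulting average can be written as a single function. The sketch below is illustrative only: `transformed_average` is a hypothetical helper, the counts are invented, and the three choices of f shown are the identity, the common logarithm, and the reciprocal.

```python
import math

def transformed_average(values, f, f_inverse):
    """Return A such that f(A) equals the simple average of f(X)."""
    mean_of_f = sum(f(x) for x in values) / len(values)
    return f_inverse(mean_of_f)

counts = [10, 100, 1_000, 10_000, 1_000_000]  # hypothetical counts per liter

# f(X) = X gives the arithmetic mean.
arithmetic = transformed_average(counts, lambda x: x, lambda y: y)
# f(X) = log10 X gives the geometric mean.
geometric = transformed_average(counts, math.log10, lambda y: 10 ** y)
# f(X) = 1/X gives the harmonic mean.
harmonic = transformed_average(counts, lambda x: 1 / x, lambda y: 1 / y)

print(arithmetic, geometric, harmonic)
```

The same data thus yield three quite different "averages", depending only on the variable in which the simple average is taken.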

11. NORMAL GEOMETRIC VARIATION 

The consideration of change of variable in statistics leads to in- 
teresting questions regarding the frequency distributions arising 
from random sampling. To every type of variation of the variates 
corresponds an unlimited number of different types for the various 
functions of the observed variable. Pearson has shown that the 
first four moments, giving the mean, variability, skewness, and 
kurtosis respectively, are sufficient for describing actual frequencies 
within the errors of random sampling. By change of variable, 
however, it may be possible in certain cases to reduce the order of 
such frequency types. Pearson 22 has justly opposed this method as
a general solution of the problem of random sampling, but it can be 
considered as a part of his system when the change of variable is 

21 J. Bertrand, Calcul des Probabilités, p. 180, Gauthier-Villars, Paris, 1889.
22 Karl Pearson, Biometrika, 4, 169-212, 1905-1906.

also appropriate for other reasons. Now the variation of the bacterial
index is actually found to be more nearly normal than is the
distribution of the number of bacteria. Indeed Wolman's and Hous- 
ton's results are strikingly so. 

It is therefore of particular interest to bacteriologists to be familiar
with the properties of that simple type of chance variation which
occurs in the variate (X) when the variation in the index (x) is
normal, or the law of error. Thus if (dN = N y dx) represents the
number of variates between (x) and (x + dx), normal frequency is
defined by

$$y = \frac{1}{\sigma_x\sqrt{2\pi}} \exp\left[-\frac{1}{2}\left(\frac{x-\bar{x}}{\sigma_x}\right)^2\right] \qquad (26)$$

and the only constant is the standard deviation, $\sigma_x^2 = \frac{1}{N}\sum (x-\bar{x})^2$.
Placing $\gamma = \exp(c^2\sigma_x^2)$, where c = ln 10 enters because common
logarithms are used, the corresponding X-distribution is given by

$$Y = \frac{1}{c\,\sigma_x X_m\sqrt{2\pi\gamma}} \exp\left[-\frac{1}{2\sigma_x^2}\left(\log\frac{X}{X_m}\right)^2\right] \qquad (27)$$

This may be called normal geometric variation. The geometric mean
($X_g = \gamma X_m$) is the median, one half the variates being less, one
half greater. It has never before been noticed that for small
variabilities (V) Pearson's empirical rule

Mean - Mode = 3 (Mean - Median)

can be deduced for such variation. For the exact relation
$X_g^3 = \bar{X}^2 X_m$ reduces to

$$\bar{X} - X_m = 3\left(1 - \tfrac{1}{2}V^2\right)(\bar{X} - X_g) \qquad (28)$$

where ($\bar{X}$) is the arithmetic mean and ($X_m$) is the mode, or most
probable value.

The standard deviation of the index ($\sigma_x$), or its equivalent ($\gamma$),
is the only constant of normal geometric variation. The variability
(V) is

$$V = \sigma_X/\bar{X} = \sqrt{\gamma - 1} \qquad (29)$$

and Pearson's constant ($\sqrt{\beta_1}$) which Bowley adopts as the skewness
($\kappa$) is

$$\kappa = \sqrt{\beta_1} = \mu_3/\sigma_X^3 = V(\gamma + 2) \qquad (30)$$

where ($\sigma_X$) is the standard deviation and ($\mu_3$) is the third moment
about the mean ($\bar{X}$).

Normal geometric variation has a simple theoretical interpretation.
It arises when proportional variations are distributed at random 
without correlation. Equal proportions in excess and defect are 
therefore equally probable. The fact that this chance distribution 
gives Pearson's empirical rule suggests that the logarithm could be 
used to advantage as a variable in many cases of skew variation, and 
particularly in the field of correlation. 23 
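
The constants of normal geometric variation can be checked numerically. The sketch below assumes the standard closed forms for a variate whose common logarithm is normally distributed (mode = median/γ, mean = median·√γ, with γ = exp(c²σ²) and c = ln 10); the function name, the median of 20,000 and the index standard deviation of 0.05 are illustrative choices, not values from the paper.

```python
import math

def normal_geometric_constants(median, sigma_x):
    """Closed-form constants of a variate whose log10 is normal.

    median  -- the geometric mean (which is also the median)
    sigma_x -- standard deviation of the index (common logarithm)
    """
    c = math.log(10)                      # converts common to natural logs
    gamma = math.exp((c * sigma_x) ** 2)
    variability = math.sqrt(gamma - 1)    # sigma_X / arithmetic mean
    return {
        "gamma": gamma,
        "median": median,
        "mode": median / gamma,
        "mean": median * math.sqrt(gamma),
        "variability": variability,
        "skewness": variability * (gamma + 2),
    }

k = normal_geometric_constants(median=20_000, sigma_x=0.05)

# Exact relation among the three averages: median^3 = mean^2 * mode.
lhs = k["median"] ** 3
rhs = k["mean"] ** 2 * k["mode"]

# Pearson's rule: (mean - mode) / (mean - median) is close to 3
# when the variability V is small.
pearson_ratio = (k["mean"] - k["mode"]) / (k["mean"] - k["median"])
small_v_approx = 3 * (1 - 0.5 * k["variability"] ** 2)

print(k)
print(pearson_ratio, small_v_approx)
```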

In bacteriology every change in the environment produces its
effect upon the whole bacterial population, so that proportional
variation seems only natural. Moreover, wherever characters are
discrete magnitudes, where zero practically never occurs, but where
very large variates are fairly frequent the skewness is quite generally
positive, and then the logarithm is distributed more nearly normally
than is the variate itself. But it must not be forgotten that the
variation of the bacterial index is only rarely normal within the errors
of random sampling. Pearson's first order curve (type III) is usually
a better fit even for the index, and in some extreme cases the skewness
is even more marked. On the whole, however, it cannot be questioned
that experience favors the use of the index for bacterial distributions,
that the geometric mean is usually a fairly close approximation to
the median number of bacteria, and that the average index is a fair
approximation to both the median and the most probable index.

23 Yule and McEwen have both remarked this advantage. Yule, Theory of
Statistics, p. 202, 3rd ed. Griffin, London, 1916. G. F. McEwen and E. L.
Michael, Proc. Am. Ac. Arts and Sci., 55, no. 2, p. 129, 1919.

The following examples show the time distributions at two points
in the Potomac River, 24 and are compared with the variation of
turbidity in the Washington water supply, 25 to illustrate the nature
of such distributions.

TABLE 8

                                          POTOMAC RIVER           WASHINGTON
                                        Cedar       Giesboro      WATER SUPPLY
                                        Point       Point

Number of samples (N)                   938         294           916
Variability (V) per cent                160         150           230
Skewness (per cent)                     35          50            380
Unit                                    Bacteria per liter        Turbidity (p.p.m.)
                                        x 10^[?]    x 10^[?]
Observed mode (X_m)                     10-20       20-30         10-20
Normal geometric mode                   8           7             6
Observed median                         19          43            40
Observed geometric mean                 21          40            51
Normal geometric mean
  (from arithmetic mean and σ_X)        18          52            59
Observed arithmetic mean                34          93            150

The chief use of a frequency distribution is in predicting the mag- 
nitude of deviations to be expected in future sampling. Stein has 
used for this purpose Tchebycheff's criterion, which states that a 
deviation from the mean greater than (T) times the standard devia- 
tion (<t) occurs less than once in (T 2 ) times. It is practically useless 
for normal variation, but is somewhat better for normal geometric 
variation, as is shown in table 9. 
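
The first two columns of table 9 can be recomputed as a check. This is a sketch under stated assumptions: the Tchebycheff column appears to be the two-sided bound of one in T² halved between positive and negative deviations (an inference from the tabulated values 2, 8, 18, 32, 50, not a statement in the text), and the normal column is the upper tail of the normal law. The columns for normal geometric variation are not reproduced here.

```python
import math

def tchebycheff_positive_one_in(t):
    """Bound for positive deviations, assuming the two-sided limit of
    one in T^2 is shared equally between the two signs."""
    return 2 * t ** 2

def normal_positive_one_in(t):
    """Reciprocal of the normal upper-tail frequency beyond T sigma."""
    upper_tail = 0.5 * math.erfc(t / math.sqrt(2))
    return 1 / upper_tail

rows = [(t, tchebycheff_positive_one_in(t), normal_positive_one_in(t))
        for t in range(1, 6)]
for t, bound, normal in rows:
    print(f"T={t}: Tchebycheff less than one in {bound}, "
          f"normal one in {normal:,.0f}")
```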

TABLE 9

Frequency of positive deviations exceeding Tσ

       TCHEBYCHEFF'S      NORMAL          NORMAL GEOMETRIC VARIATION
       CRITERION          DEVIATION       V = 10         V = 50         V = 100
                                          per cent       per cent       per cent
  T    less than one in   one in          one in         one in         one in

  1            2                6               5              3              5
  2            8               43              24              9             14
  3           18              740             175             21             33
  4           32           31,500           1,140             47             68
  5           50                           26,700            104            136


It is hoped that enough advantages have been demonstrated in the 
bacterial index to convince bacteriologists of its propriety. We have 
no doubt that if once used, its practical convenience will lead to its 
adoption as the standard bacterial index. If bacterial indices are 
chosen as constituents of any more comprehensive index of water 
quality, 26 the component indices must be stable or they will have 
undue and erratic influence upon such a quality index. The stand- 
ardization of the bacterial index is therefore a necessary preliminary 
in any more ambitious program. 



24 U. S. Hygienic Laboratory Bulletin, no. 104, p. 84, 1916.
25 W. F. Wells, Proc. Am. Water Works Assn., 353-63, June, 1913.
26 See discussion of this question by A. Wolman (this Journal, 6, 444-156,
1919).

12. SUMMARY

1. The paper is a statistical study of the advantages of the logarithm
of the number of bacteria per liter as a bacterial index.

2. Poisson's "law of small numbers" is applied to the dilution
method, assuming the random samples to be independent (uncor- 
related). 

3. The simplest dilution scale is formed by assigning serial num- 
bers 1, 2, 3, 4, etc., to inoculations containing 100 cc, 10 cc, 1 cc, 
0.1 cc, etc. The "dilution positive" is the highest dilution in which 
bacteria are present. 

4. The average dilution positive, with the constant 0.24 added, 
gives the average bacterial index. The standard error of sampling 
of a single tube is 0.55, corresponding to a variability in the number 
of bacteria of 200 per cent. 

5. The dilution positive method is shown to be a simple and logical 
system of weighting the average, the constant correction 0.24 
resulting from the dilution interval chosen. 

6. When the tests are incomplete, so that in the largest inoculations 
less than 25 per cent of the tubes are positive, the most probable 
bacterial index is simply the logarithm of the sum of the percentages 
positive in dilutions 1, 2, and 3, minus 1.05; or of dilutions 2, 3, 4, 
minus 0.05, etc. But complete tests are recommended, whenever 
possible. 

7. When plates are counted the fluctuations of sampling are un- 
important, being less than 30 per cent for more than 10 colonies per 
plate. The disadvantage of taking logarithms is more than com- 
pensated by the greater stability of the average index, for propor- 
tional errors of measurement are larger in the larger variates, which 
are subject to more dilutions. Significant tendencies can be recog- 
nized in the average index with fewer samples than when the average 
number of bacteria is taken. 

8. Bacterial distributions often approach the normal form when 
the index is used as variable. The geometric mean number of bac- 
teria (the antilogarithm of the average index) is therefore a fair 
approximation to the median number. 

9. Incidentally it is noticed that Pearson's empirical rule: Mean
- Mode = 3 (Mean - Median) holds for normal geometric
variation of small variability.

10. The choice of average, as of observed character, is arbitrary, 
a matter of convenience. From every point of view the geometric 
mean number and the average index are the most appropriate 
expressions for results in bacteriology. 
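
The two working rules of items 4 and 6 can be written out as follows. This is a minimal sketch: the function names and the sample inputs are hypothetical, and only the constants 0.24, 1.05 and 0.05 come from the summary. The shift of the correction by one unit per dilution step follows the "etc." of item 6.

```python
import math

DILUTION_POSITIVE_CORRECTION = 0.24  # constant of summary item 4

def average_index_from_dilutions(dilution_positives):
    """Item 4: average dilution positive plus 0.24 gives the average index."""
    average = sum(dilution_positives) / len(dilution_positives)
    return average + DILUTION_POSITIVE_CORRECTION

def index_from_percent_positive(percentages, first_dilution=1):
    """Item 6: log of the summed percentages positive in three successive
    dilutions, minus 1.05 for dilutions 1-3, minus 0.05 for 2-4, etc."""
    correction = 1.05 - (first_dilution - 1)
    return math.log10(sum(percentages)) - correction

# Hypothetical complete tests: dilution positive observed in four samples.
index = average_index_from_dilutions([3, 4, 4, 5])
usual_number = 10 ** index   # geometric mean number of bacteria per liter

# Hypothetical incomplete test: 10, 1 and 0.1 per cent of tubes positive
# in dilutions 1, 2 and 3 (fewer than 25 per cent in the largest inoculation).
sparse_index = index_from_percent_positive([10, 1, 0.1])

print(index, usual_number, sparse_index)
```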



